Introduction
Use your own audio recordings instead of text-to-speech for complete control over voice, timing, and delivery. Perfect for professional voice-overs, branded audio, or languages not supported by TTS.
Key Features
Custom Voice
Use professional voice talent recordings
Full Control
Control timing, tone, and delivery
Brand Voice
Maintain consistent brand audio identity
Any Language
Use audio in any language or dialect
When to Use: Best for professional voice-overs, branded audio, dialects not supported by TTS, or when you need precise control over delivery.
Quick Start
Related API Endpoints
| Endpoint | Purpose | Documentation |
|---|---|---|
| POST /upload/asset | Upload audio file | API Reference |
| POST /create_video_from_avatar | Create video with audio | API Reference |
| GET /avatar_video/{id} | Check video status | API Reference |
Key Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| voice.type | string | ✅ | Must be "audio" when using audio_url |
| voice.audio_url | string | ✅ | URL of uploaded audio file |
| voice.voice_id | string | ✅ | Voice ID (still required) |
| avatar.avatar_id | integer | ✅ | Avatar ID |
| avatar.avatar_type | integer | ✅ | 0=Public, 1=Custom |
| aspect_ratio | string | ✅ | portrait/landscape/square |
| screen_style | integer | ✅ | 1=Full screen, 2=Split screen, 3=Picture in picture |
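As a rough illustration of how the parameters in the table fit together, the sketch below assembles and validates a request body in Python. It assumes that dotted names such as voice.type map to nested JSON objects, which the table suggests but does not confirm. `build_audio_video_payload` is a hypothetical helper, not part of any SDK.

```python
ASPECT_RATIOS = {"portrait", "landscape", "square"}
SCREEN_STYLES = {1, 2, 3}  # 1=Full screen, 2=Split screen, 3=Picture in picture
AVATAR_TYPES = {0, 1}      # 0=Public, 1=Custom


def build_audio_video_payload(*, audio_url, voice_id, avatar_id,
                              avatar_type, aspect_ratio, screen_style):
    """Assemble a request body from the parameters in the table.

    Assumption: dotted parameter names map to nested JSON objects.
    """
    if avatar_type not in AVATAR_TYPES:
        raise ValueError("avatar_type must be 0 (Public) or 1 (Custom)")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if screen_style not in SCREEN_STYLES:
        raise ValueError("screen_style must be 1, 2, or 3")
    return {
        "voice": {
            "type": "audio",         # required when audio_url is used
            "audio_url": audio_url,  # URL of the uploaded audio file
            "voice_id": voice_id,    # still required alongside audio_url
        },
        "avatar": {"avatar_id": avatar_id, "avatar_type": avatar_type},
        "aspect_ratio": aspect_ratio,
        "screen_style": screen_style,
    }
```

Validating the enumerated values locally surfaces typos before a request is sent.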
Audio Requirements
Supported Formats:
- MP3 (recommended)
- WAV
- M4A
- Max size: 20MB
- Max duration: 10 minutes
- Recommended bitrate: 192 kbps for music, 128 kbps for voice
- Sample rate: 44.1 kHz
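Some of these requirements can be checked locally before uploading. The sketch below covers only the file extension and the size limit. It treats "20MB" as 20 × 1024 × 1024 bytes, which is an assumption, and it does not check duration, bitrate, or sample rate, since those need an audio library. `check_audio_file` is a hypothetical helper name.

```python
import os

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a"}
MAX_SIZE_BYTES = 20 * 1024 * 1024  # assumption: "20MB" means 20 MiB


def check_audio_file(path):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_SIZE_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_SIZE_BYTES}")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once.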
Code Examples
Step 1: Get signed URL for upload
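For this step, a request to POST /upload/asset might be built as in the sketch below. Only the endpoint path comes from this page. The base URL is a placeholder, and the bearer-token header and the `file_name` and `content_type` body fields are assumptions to confirm against the API Reference.

```python
import json
import urllib.request

API_BASE = "https://api.example.com"  # placeholder; substitute the real base URL


def build_signed_url_request(api_key, file_name, content_type="audio/mpeg"):
    """Build the POST /upload/asset request (body fields are assumptions)."""
    body = json.dumps({"file_name": file_name, "content_type": content_type})
    return urllib.request.Request(
        f"{API_BASE}/upload/asset",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def request_signed_url(api_key, file_name):
    """Send the request and return the parsed JSON response."""
    req = build_signed_url_request(api_key, file_name)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

The response is expected to carry both a signed upload URL and the asset_url used later. The exact field names should be taken from the API Reference.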
Step 2: Upload your file using PUT
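The PUT upload could look like the following sketch, which sends the raw file bytes to the signed URL using only the Python standard library. The function names are illustrative. The Content-Type value is an assumption: many storage providers require it to match the type the URL was signed for.

```python
import urllib.request


def build_upload_request(signed_url, audio_bytes, content_type="audio/mpeg"):
    """Build a PUT request that sends raw audio bytes to the signed URL."""
    return urllib.request.Request(
        signed_url,
        data=audio_bytes,
        headers={"Content-Type": content_type},
        method="PUT",
    )


def upload_audio(signed_url, path, content_type="audio/mpeg"):
    """Read the file from disk and PUT it; returns the HTTP status code."""
    with open(path, "rb") as f:
        req = build_upload_request(signed_url, f.read(), content_type)
    with urllib.request.urlopen(req, timeout=120) as resp:
        return resp.status
```

Signed URLs normally carry their own credentials in the query string, so this sketch sends no API key header.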
Step 3: Create video with the audio
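A request for this step might be assembled as in the sketch below. The parameter names come from the Key Parameters table. The nested JSON layout, the placeholder base URL, and the bearer-token header are assumptions, and the fixed avatar_type, aspect_ratio, and screen_style values are examples only.

```python
import json
import urllib.request

API_BASE = "https://api.example.com"  # placeholder; substitute the real base URL


def build_create_video_request(api_key, asset_url, voice_id, avatar_id):
    """Build the POST /create_video_from_avatar request for an uploaded audio file."""
    payload = {
        "voice": {
            "type": "audio",         # must be "audio" when audio_url is used
            "audio_url": asset_url,  # asset_url saved from the upload response
            "voice_id": voice_id,    # still required
        },
        "avatar": {"avatar_id": avatar_id, "avatar_type": 0},  # 0=Public
        "aspect_ratio": "landscape",
        "screen_style": 1,  # 1=Full screen
    }
    return urllib.request.Request(
        f"{API_BASE}/create_video_from_avatar",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_video(api_key, asset_url, voice_id, avatar_id):
    """Send the request and return the parsed JSON response."""
    req = build_create_video_request(api_key, asset_url, voice_id, avatar_id)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```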
Save the asset_url from the upload response to use as voice.audio_url. Video length will automatically match audio duration.
Step 4: Check Video Status
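A polling loop for the status check might look like the sketch below. The `status` field name and the values "completed" and "failed" are assumptions, as are the placeholder base URL and the bearer-token header. Check the API Reference for the actual response shape.

```python
import json
import time
import urllib.request

API_BASE = "https://api.example.com"  # placeholder; substitute the real base URL
DONE_STATES = {"completed"}           # assumed status value
FAILED_STATES = {"failed"}            # assumed status value


def fetch_video_status(api_key, video_id):
    """Perform GET /avatar_video/{id} and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{API_BASE}/avatar_video/{video_id}",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_video(fetch, video_id, interval_s=5.0, timeout_s=600.0,
                   sleep=time.sleep):
    """Call fetch(video_id) until a terminal state is reported or time runs out."""
    waited = 0.0
    while True:
        result = fetch(video_id)
        status = result.get("status")  # assumed field name
        if status in DONE_STATES:
            return result
        if status in FAILED_STATES:
            raise RuntimeError(f"video {video_id} failed: {result}")
        if waited >= timeout_s:
            raise TimeoutError(f"video {video_id} not ready after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s
```

A caller would pass something like `lambda vid: fetch_video_status(api_key, vid)` as `fetch`. Taking the fetch and sleep functions as arguments keeps the loop easy to exercise without network access.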
Poll GET /avatar_video/{id} until the video is ready.
Use Case Examples
Professional Voice-Overs
Use recordings from professional voice talent:
- Record in professional studio
- Upload final edited audio
- Create videos with consistent voice
- Maintain professional quality
Brand Voice
Maintain branded audio across videos:
- Use company spokesperson voice
- Record once, use in multiple videos
- Consistent brand audio identity
- Scale branded content easily
Multiple Takes
Use the best take from multiple recordings:
- Record several versions
- Upload the best performance
- Edit audio before uploading
- Perfect timing and delivery
Unsupported Languages
Use audio in languages not supported by TTS:
- Record in any language or dialect
- Upload custom audio
- Create videos with native speakers
- Reach diverse audiences

